Transmissible gastroenteritis virus (TGEV) is an enteropathogenic coronavirus isolated for the first time in 1946. 
Nonenteropathogenic porcine respiratory coronaviruses (PRCVs) have been derived from TGEV. The genetic relation- 
ship among six European PRCVs and five coronaviruses of the TGEV antigenic cluster has been determined based on 
their RNA sequences. The S proteins of the six PRCVs have an identical deletion of 224 amino acids starting at position 21. 
The deleted area includes the antigenic sites C and B of the TGEV S glycoprotein. Interestingly, two viruses (NEB72 and 
TOY56) with respiratory tropism have S proteins similar in size to those of the enteric viruses. The S proteins of NEB72 
and TOY56 differ from those of the enteric viruses by 2 and 15 specific amino acids, respectively. Four of the changed residues (aa 
219 of the NEB72 isolate and aa 92, 94, and 218 of TOY56) are located within the deletion present in the PRCVs and may be 
involved in the receptor binding site (RBS) conferring enteric tropism to TGEVs. A second RBS used by the virus to 
infect ST cells might be located in a conserved area between sites A and D of the S glycoprotein, since monoclonal 
antibodies specific for these sites inhibit the binding of the virus to ST cells. An evolutionary tree relating 13 enteric and 
respiratory isolates has been proposed. According to this tree, a main virus lineage evolved from a recent progenitor 
virus which was circulating around 1941. From this lineage, secondary lineages gave rise to PUR46, NEB72, TOY56, MIL65, 
BRI70, and the PRCVs, in this order. Least squares estimation of the origin of TGEV-related coronaviruses showed a 
significant constancy in the fixation of mutations with time, that is, the existence of a well-defined molecular clock. A 
mutation fixation rate of 7 ± 2 × 10⁻⁴ nucleotide substitutions per site per year was calculated for TGEV-related 
viruses. This rate falls in the range reported for other RNA viruses. Point mutations and probably recombination events 
have occurred during TGEV evolution. 


INTRODUCTION 


Transmissible gastroenteritis virus (TGEV) belongs 
to one of the two major antigenic groups of mammalian 
coronaviruses (Siddell et al., 1982; Spaan et al., 1988). 
The virus was first isolated in 1946 (Cox et al., 1990a; 
Doyle and Hutchings, 1946). It is an enteropathogenic 
coronavirus which replicates both in villus epithelial 
cells of the small intestine and in lung cells. In 1984, a 
nonenteropathogenic virus related to TGEV, the porcine 
respiratory coronavirus (PRCV), appeared in Europe 
(Pensaert et al., 1986; Callebaut et al., 1988). This 
virus replicates to high titers in the respiratory tract and 
undergoes only limited replication in unidentified 
submucosal cell types of the small intestine (Cox et al., 
1990a,b). A virus similar to the European PRCV has 
recently been described in North America (Wesley et 
al., 1990b). In contrast to TGEV, PRCV infection produced no 
clinical signs of disease (Pensaert et al., 1986; Duret et 
al., 1988; Wesley et al., 1990b). 

The antigenic cross-reaction among isolates of TGEV and PRCV has been 
clearly documented (Callebaut et al., 1988; Garwes et al., 1988; Sánchez 
et al., 1990; Rasschaert et al., 1990; Wesley et al., 1990b). Both types of 
viruses have common antigenic determinants in the three structural 
proteins: spike (S), membrane (M), and nucleoprotein (N). The absence of 
two antigenic sites in the S protein of the PRCV isolates has been the 
basis for their differentiation from the enteric viruses (Sánchez et al., 
1990). Sequencing of the S gene of a French PRCV isolate (Rasschaert et 
al., 1990), and of a 200-nucleotide (nt) fragment of the S gene of a North 
American PRCV isolate (Wesley et al., 1990a) has revealed that both S 
proteins contain, at comparable locations within the protein, a single 
deletion of 224 and 227 amino acids, respectively. These isolates also 
showed deletions, which were different in each virus, in the genes coding 
for the nonstructural proteins, mapping downstream of the 3’-end of the S 
gene (Britton, 1990; Rasschaert et al., 1990; Wesley et al., 1991). PRCV is 
transmitted by aerosols and has now been detected in most European 
countries (Enjuanes and Van der Zeijst, 1992). It has been proposed 
(Enjuanes and Van der Zeijst, 1992) that PRCV behaves as a natural vaccine 
against TGE, which makes the study of its origin and evolution 
interesting. The analysis of the genetic relationship among these 
respiratory isolates and others with respiratory tropism would allow us to 
determine the molecular basis of their tropism and evolution. 

In this manuscript we describe the genetic homology among eight 
respiratory and three enteric isolates of the TGEV antigenic cluster, which 
identified amino acids potentially involved in receptor binding sites and 
conserved areas of the S gene. Based on these viral sequences, an 
evolutionary tree and mechanisms for TGEV evolution have been proposed. 


* To whom reprint requests should be addressed. 

0042-6822/92 $5.00 
Copyright © 1992 by Academic Press, Inc. 
All rights of reproduction in any form reserved. 


MATERIALS AND METHODS 
Cells and viruses 


All viruses were grown in swine testicle (ST) cells 
(McClurkin and Norman, 1966). The characteristics of 
the viruses are described in Table 1. For simplicity, the 
viruses are named in the text with three letters indicating 
their geographical origin or classical name, followed 
by two numbers indicating the earliest year of 
isolation as reported in the literature. The antigenic 
characteristics of most of these viruses have been 
previously reported (Sánchez et al., 1990). Viruses were 
purified as described (Correa et al., 1990). 


Virus proteins 


Protein analysis was performed after dissolution (1 
µg/20 µl) in 0.1 M sodium acetate, pH 7, 0.5% sodium 
dodecyl sulfate (SDS), 1 mM phenylmethylsulfonyl fluoride 
(PMSF), 0.1 mM N-α-p-tosyl-L-lysine chloromethyl 
ketone (TPCK), and 1 µg/ml pepstatin. When indicated, 
proteins were deglycosylated by incubation overnight 
at 37° with protein N-glycosidase F (0.04 U/ml, 
Boehringer-Mannheim), and the reaction was stopped 
by freezing. Proteins were subjected to SDS-7.5% 
polyacrylamide gel electrophoresis (PAGE) after the 
samples were reduced with 5% 2-mercaptoethanol 
(Laemmli, 1970). Finally, the proteins were detected by 
silver staining (Ansorge, 1985). 


RNA sequencing 


RNA was extracted from purified virions as described 
by Gebauer et al. (1991). RNA was sequenced 
by oligodeoxynucleotide primer extension and the 
dideoxynucleotide chain termination procedure (Sanger et al., 
1977) using the protocol described by Fichot and Girard 
(1990). For RNA sequencing, primers complementary 
to the S gene (Gebauer et al., 1991) were used. 
Sequence data were assembled using the computer 
programs of the Genetics Computer Group (University 
of Wisconsin). 


Evolutionary tree 


Sequence information has been analyzed following 
standard phylogenetic methods. The distance between 
each pair of nucleotide sequences was estimated 
using the formula d = −(3/4) ln(1 − 4p/3) L (Jukes 
and Cantor, 1969), where p is the proportion of 
changed nucleotides displayed by the compared 
sequences, and L is the length of the sequences after 
alignment. The two gaps introduced to align the 
sequences were excluded from the calculations. The 
neighbor-joining method (Saitou and Nei, 1987; Sourdis 
and Nei, 1988), as implemented in the program 
TREEDIST (available from J.D. upon request), was used 
to obtain a phylogenetic tree from the pairwise 
distance matrix. A parallel phylogenetic analysis was 
carried out using the least squares method (Fitch and 
Margoliash, 1967), utilizing the program FITCH from 
the PHYLIP package, version 3.3 (Felsenstein, 1990). 
The reliability of the tree, i.e., the confidence levels for 
branching order, was determined by the bootstrap 
method (Efron, 1982; Felsenstein, 1985). A large number 
of bootstrap replicates of the original set of 
sequences was obtained. For each replicate a phylogenetic 
tree was obtained as described above. From these trees, a 
consensus topology for the tree, as well as confidence 
intervals for each branching point (Felsenstein, 1985), 
were obtained by applying the program CONSENSE, 
also from the PHYLIP package. Automated derivation 
of bootstrap replicates, distance matrices, and 
neighbor-joining tree estimations were provided by the 
TREEDIST program. 
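
The distance estimate described above can be illustrated with a minimal Python sketch. It computes the Jukes and Cantor distance for aligned sequences and skips any column that contains a gap. This is not the TREEDIST or PHYLIP code; the function names and the gap symbols are assumptions made for the example.

```python
import math
from itertools import combinations

GAP_CHARS = "-."  # assumed gap symbols (the alignment in Fig. 2 uses points)


def jukes_cantor_distance(seq_a, seq_b):
    """Return d = -(3/4) ln(1 - 4p/3) L for two aligned nucleotide sequences.

    Columns with a gap in either sequence are excluded, so L is the number
    of gap-free columns and p is the fraction of those columns that differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in GAP_CHARS and b not in GAP_CHARS
    ]
    length = len(columns)
    if length == 0:
        raise ValueError("no gap-free columns to compare")
    p = sum(a != b for a, b in columns) / length
    if p >= 0.75:
        raise ValueError("p >= 3/4: the correction is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0) * length


def distance_matrix(alignment):
    """Pairwise distances for a dict mapping isolate name -> aligned sequence."""
    return {
        (name_a, name_b): jukes_cantor_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }
```

A matrix built this way is the kind of input that a neighbor-joining or least squares tree program expects; the tree-building step itself is not sketched here.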

The origin of the phylogenetic tree was estimated by 
a linear least squares procedure (Sokal and Rohlf, 
1981). We assumed a constant average rate of fixation 
of mutations. This procedure determines the origin by 
finding the point in the tree that minimizes the residual sum of 
squares of a linear least squares fit relating the 
distances between each isolate and this point to the 
isolation dates. The slope of the line provides an estimate of 
the rate of fixation of mutations. The intercept of the 
line with the horizontal axis (time) gives an estimate of 
the date of origin of the TGEV antigenic cluster of viruses. 
Errors and confidence intervals were calculated for the 
slope and the intercept with the time axis (Sokal and 
Rohlf, 1981). 
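
The regression step of this procedure can be sketched in Python as follows. The sketch assumes that the distances from a candidate origin to each isolate are already known; the search over points in the tree is not shown. The function name is hypothetical, and the example distances are made up so that they fall exactly on a line; they are not the values measured in this study.

```python
def fit_molecular_clock(years, distances):
    """Fit distance = slope * (year - origin_year) by least squares.

    Returns (slope, origin_year, r_squared), where origin_year is the
    intercept of the fitted line with the time axis.
    """
    n = len(years)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three (year, distance) pairs")
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("degenerate data: no usable trend")
    slope = sxy / sxx                     # substitutions per year
    intercept = mean_y - slope * mean_x   # distance at year zero
    origin_year = -intercept / slope      # year at which distance is zero
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, origin_year, r_squared


# Made-up points lying exactly on a line with slope 1 and origin 1941.
example_years = [1946, 1956, 1965, 1970, 1985]
example_distances = [5.0, 15.0, 24.0, 29.0, 44.0]
slope, origin_year, r_squared = fit_molecular_clock(example_years, example_distances)
```

With real data the points scatter around the line, and the errors of the slope and of the intercept would be computed as well, as described in the text.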


RESULTS 


Structural proteins of enteric and respiratory 
porcine coronaviruses 


TABLE 1 

CORONAVIRUSES USED IN THIS PUBLICATION 

Designation | Origin (year of isolation) | Dominant tropism | Characteristics | Reference 

TGEV 
PUR46-MAD-CC120 | Purdue University, Indiana (1946) | Enteric & Respiratory | Enteric virus originally isolated by Bohl. Passaged 120-fold on ST cells. Reference clone used in our laboratory. | Bohl et al., 1972; Sánchez et al., 1990; Gebauer et al., 1991 
PUR46-PAR-CC120 | idem | idem | Same origin as PUR46-MAD-CC120. Clone used by H. Laude’s group. | Bohl et al., 1972; Rasschaert and Laude, 1987 
PUR46-UTR-CC120 | idem | idem | Same origin as PUR46-MAD-CC120. Clone used at Utrecht University. | Bohl et al., 1972; Jacobs et al., 1987 
MIL65-AME | Ohio (1965 or before) | idem | Virulent. Passed in vivo. Plaque purified three times on ST cells. | Wesley, 1990 
BRI70-FS772 | England (1970) | idem | Maintained by passage in primary cultures of thyroid cells. | Garwes et al., 1978 
NEB72-RT | Nebraska (1972) | Respiratory | Isolated from the lungs of a healthy adult pig. Passaged in the lungs of gnotobiotic pigs. Passaged in vitro in lung cells and on ST cells. | Underdahl et al., 1974; This manuscript 
TOY56-CC168 | Japan (1956) | Respiratory (sporadically isolated in enteric tissues) | Received at passage 163 in swine kidney cells. Passaged 5 times on ST cells. | Furuuchi et al., 1976; Sánchez et al., 1990 

PRCV 
HOL87-V78-CC5 | The Netherlands (1987) | Respiratory | Originally isolated on ST cells and passaged 5 times on this cell line. | Pensaert et al., 1986; Sánchez et al., 1990 
BEL85-83-CC3 | Belgium (1985) | idem | idem | idem 
BEL87-31-CC5 | Belgium (1987) | idem | idem | idem 
FRA86-RM4 | France (1986) | idem | idem | Duret et al., 1988; Rasschaert et al., 1990 
ENG86-I-CC5 | England (1986) | idem | Isolate PVC-135308, originally grown on primary pig kidney cells and passaged 5 times on ST cells. | Brown and Cartwright, 1986; Garwes et al., 1988; Sánchez et al., 1990 
ENG86-II-CC5 | England (1986) | idem | Isolate PVC-137004, isolated and passaged as PVC-135308. | idem 


[Fig. 1: gel image. Lanes: molecular weight markers (116K indicated); enteric virus PUR46; respiratory viruses NEB72, TOY56, BEL85, BEL87, and HOL87; each virus without (−) and with (+) deglycosylation.] 

Fig. 1. PAGE analysis of the spike protein of TGEV-related coronaviruses 
before and after deglycosylation. Purified viruses were dissociated 
(1 µg/20 µl) in 0.1 M sodium acetate, pH 7, with 0.5% SDS 
and protease inhibitors, and incubated overnight at 37° in the presence 
(+) or absence (−) of protein N-glycosidase F (0.04 U/ml). The 
proteins were separated by 7.5% PAGE in the presence of 0.1% SDS 
and 2-mercaptoethanol and detected using silver staining (Ansorge, 
1985). Only the gel area corresponding to the S glycoprotein is 
shown. 


Both enteric and respiratory TGEVs have been studied. The respiratory 
viruses could be grouped in two clusters, one lacking antigenic sites B 
and C (the PRCVs) and another with these antigenic sites (NEB72 and 
TOY56) (Sánchez et al., 1990). The molecular weights of the structural 
proteins of the TGEVs and PRCVs listed in Table 1 were determined, with 
the exception of those from the isolates BRI70 and FRA86, which have not 
been analyzed in this study. These molecular weights were estimated by 
SDS-PAGE analysis. The mobility of the M and N proteins of all viruses 
was similar (data not shown). In contrast, the TGEV S glycoproteins and 
the apoproteins, obtained by deglycosylation with protein N-glycosidase, 
had higher molecular weights (200 and 158 kDa, respectively) than the 
S glycoproteins and apoproteins of the PRCVs (170 and 130 kDa, 
respectively). The results for the standard PUR46 strain, two TGEV 
strains (NEB72 and TOY56) with respiratory tropism, and three PRCVs 
(BEL85, BEL87, and HOL87) are shown as representative data (Fig. 1). 
These results indicate that all the European PRCVs studied have an S 
protein of similar molecular weight (170 kDa), which is smaller than that 
of the S glycoprotein of TGEV. In addition, they demonstrate that other 
isolates with an almost exclusive respiratory tropism (NEB72 and TOY56) 
do not show the reduction in molecular weight detected in the PRCV 
isolates. 
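
As a rough consistency check on these sizes, the drop in apoprotein mass can be compared with the mass expected from a 224-residue deletion. The Python sketch below does this arithmetic. The average residue mass of 110 Da is a common rule of thumb and an assumption of this example, not a value from this study, and SDS-PAGE sizes are themselves approximate.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the text


def expected_mass_loss_kda(deleted_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate polypeptide mass (kDa) removed by deleting residues."""
    return deleted_residues * residue_mass_da / 1000.0


# Values reported in the text (SDS-PAGE estimates, in kDa).
TGEV_APOPROTEIN_KDA = 158
PRCV_APOPROTEIN_KDA = 130
DELETED_RESIDUES = 224

observed_shift_kda = TGEV_APOPROTEIN_KDA - PRCV_APOPROTEIN_KDA
expected_shift_kda = expected_mass_loss_kda(DELETED_RESIDUES)
print(f"observed {observed_shift_kda} kDa, expected about {expected_shift_kda:.1f} kDa")
```

Under this assumption the two figures are of the same order, which is what would be expected if the size difference seen on the gel is mainly due to the deletion.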


Sequence analysis of the S-glycoprotein of TGEVs 
and PRCVs 


To determine the relationship among the different European PRCV 
isolates, the complete S gene sequences of the PRCV HOL87, TOY56, and 
NEB72 respiratory isolates were determined by sequencing the RNA from 
purified virions (Fig. 2). In addition, the first 1956 nt of the S gene 
of four other PRCV strains were determined (Fig. 2). The nucleotide or 
amino acid positions reported in this manuscript refer to the location of 
equivalent residues in the sequence of MIL65 virus, which has the largest 
S gene reported for TGEV-related isolates. The 5’-terminal segment 
sequenced codes for the four antigenic sites previously defined, which 
are located in the globular part of the peplomer (Gebauer et al., 1991). 
The sequences were aligned with those of the PRCV FRA86 (Rasschaert et 
al., 1990) and of prototype enteric viruses (Fig. 2). Two deletions were 
observed, which have been diagrammatically summarized in Fig. 3A. One of 
them removed 224 amino acids, starting at residue 21 of the unprocessed 
glycoprotein. The second deletion removed 2 amino acids after residue 
374. Taking into account the two deletions and the sequence homology 
among the S genes of these isolates, three sets of viruses could be 
distinguished: (i) one including the BEL85, FRA86, HOL87, BEL87, ENG86-I, 
and ENG86-II strains, with a 224-aa deletion which was identical both in 
the number of residues deleted and in the location of the deletion; (ii) 
a second set including the PUR46 and NEB72 isolates, with a deletion of 
2 aa; and (iii) a third set grouping the MIL65, BRI70, and TOY56 strains, 
which had no deletion. Although the NEB72 and TOY56 isolates have 
respiratory tropism, they do not contain the 224-amino-acid deletion. 
These viruses differ from the enteric viruses by point mutations (Fig. 
2). When compared to the PUR46 isolate, the NEB72 isolate has only two 
amino acid differences in the S protein that are not shown by other 
enteric isolates. One of them (aa 219) falls within the deletion present 
in the PRCVs. The NEB72 isolate is closely related to the PUR46 strain 
since, in addition, both viruses have the 2-aa deletion (residues 375 and 
376) and almost identical sequences in ORFs 3, 3-1, and 4, corresponding 
to nonstructural proteins (data not shown). The TOY56 isolate has three 
amino acid changes (residues 92, 94, and 218) within the deletion present 
in the PRCVs, relative to the PUR46 strain, which are specific for the 
TOY56 isolate. The enteric isolates BRI70-FS and MIL65-AME also have a 
change in aa 218, from valine to threonine, which is different from the 
change to isoleucine that occurred in the TOY56 isolate. 


[Fig. 2: sequence alignment (image not recoverable). Rows: PUR46-MAD (nucleotide and deduced amino acid sequence), NEB72, TOY56, BRI70-FS, MIL65-AME, BEL85-83, FRA86-RM, ENG86-I, ENG86-II, HOL87, and BEL87-31; antigenic sites C, B, D, and A are marked along the sequence.] 

Fig. 2. Sequence alignment of spike (S) protein genes of TGEVs and PRCVs. 
The nucleotide sequence of the S gene and the deduced aa sequence of the 
PUR46-MAD virus are shown in the first two lines. In the other lines, the 
nucleotide changes in the sequences of other viruses have been indicated. 
Nucleotide changes resulting in amino acid changes have been shadowed. In 
the alignment deleted residues have been filled out with points. Sequence 
numbers indicate the positions that the residues would have in the MIL65 
virus. For simplicity, the sequences of two clones of the PUR46 isolate 
(PUR46-PAR and PUR46-UTR) have been omitted in this series of sequences, 
since they show minor changes and their sequences were previously 
published. The sequences of the strains FRA86-RM, MIL65-AME, BRI70-FS, 
PUR46-PAR, and PUR46-UTR have been previously reported (Britton and Page, 
1990; Jacobs et al., 1987; Rasschaert and Laude, 1987; Rasschaert et al., 
1990; Wesley, 1990). Sequence indeterminations have been coded as: K for 
G or T; X for G, A, T, C, or any amino acid; S for C or G; and Y for C or 
T. Underlined amino acids correspond to the signal peptide. Residues in 
boxes are involved in the indicated antigenic sites. Asterisks indicate 
the 3’-end of the segments sequenced. Dashes indicate nonsequenced 
segments. The nucleotide sequence data reported in this paper have been 
submitted to the GenBank nucleotide sequence database and have been 
assigned the accession numbers: PUR46-MAD, M94101; NEB72-RT, M94099; 
TOY56, M94103; HOL87, M94097; BEL85-83, M94096; BEL87-31, M94098; 
ENG86-I, M94100; ENG86-II, M94102. 


The amino acid homology between the S proteins of PRCVs and TGEVs was 
independently studied in the globular and the stem portions of the 
molecule (data not shown). The same overall degree of homology in the S 
proteins was found in both the globular and stem areas. The amino acid 
homology was higher than 98% among the TGEV isolates and among the PRCV 
isolates. In contrast, the overall S protein homology between TGEVs and 
PRCVs was around 1% lower. Although this percentage difference is small, 
the fact that these viruses have amino acid changes in almost identical 
locations makes this difference significant. In these comparisons, only 
the S protein segments for which the sequences of the 13 viruses were 
available have been considered. A large conserved domain was identified 
in the globular portion of the S protein of TGEVs and PRCVs, between 
amino acids 405 and 465, when the number of amino acid changes was 
plotted versus their position in the sequence (Fig. 3B). Furthermore, no 
amino acid changes were detected in this segment when the sequences of 
the 13 virus isolates were compared (Fig. 3B). 
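
The search for a conserved domain, in which the number of amino acid changes is tallied at each position, can be sketched in Python as follows. The function names and the toy sequences are hypothetical and serve only to illustrate the counting; they are not the S protein sequences of this study.

```python
GAP_CHARS = "-."  # assumed symbols for deleted or unsequenced residues


def changes_per_position(reference, others):
    """Count, for each column, how many sequences differ from the reference.

    Gapped positions in the compared sequences are not counted as changes.
    """
    if any(len(seq) != len(reference) for seq in others):
        raise ValueError("all sequences must be aligned to the same length")
    return [
        sum(1 for seq in others if seq[i] not in GAP_CHARS and seq[i] != ref_res)
        for i, ref_res in enumerate(reference)
    ]


def longest_conserved_run(counts):
    """Return (start_index, length) of the longest stretch with zero changes."""
    best_start, best_len = 0, 0
    run_start = None
    for i, count in enumerate(list(counts) + [1]):  # sentinel closes the last run
        if count == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return best_start, best_len


# Toy alignment (hypothetical sequences).
toy_reference = "MKKLFVVLVV"
toy_others = ["MKKLFVVLVV", "MKRLFVVLIV", "M..LFVVLVA"]
toy_counts = changes_per_position(toy_reference, toy_others)
toy_run = longest_conserved_run(toy_counts)
```

Plotting the per-position counts against position gives a profile of the kind summarized in Fig. 3B, in which a conserved domain appears as a long stretch of zeros.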


Evolutionary tree for the S gene of TGEVs and PRCVs 


The nucleotide sequences of the S glycoprotein genes of 
eight respiratory isolates and five enteric TGEVs (three of which were 
different clones of the same PUR46 virus strain) were 
aligned, taking into account the two deletions of 6 and 
672 nt present in the sequences of PUR46 and the 
PRCVs, respectively, for maximum fit (Fig. 2). Phylogenetic 
analysis of the sequences (first 1956 nt) of 
the viruses described in Fig. 2, by either the neighbor-joining 
or the least squares method of tree reconstruction, 
gave two identical trees, with the 
same branching order, confidence levels, and branch 
lengths (Fig. 4). This congruence in the results, in addition 
to the high confidence level along the tree, suggests 
that the evolutionary history described is reliable. 
The least squares relationship between the 
number of mutations from the origin and the year of isolation 
was determined (Fig. 5). The extrapolation of this 
line to zero mutations allowed us to predict that these 
TGEVs originated from a recent common ancestor 
circulating around 1941. Since then, from a main lineage, 
PUR46, TOY56, MIL65, BRI70, and the 
PRCVs were derived in the indicated order (Fig. 4). Only 
one isolate (NEB72) accumulated fewer substitutions 
than expected for its year of 
isolation. In at least three cases (TOY56, MIL65, and 
BEL85), it can be stated, with a significance of 99.9%, 
that these were lateral lineages derived from one main 
lineage (see Discussion). The accumulation of mutations 
with time (Fig. 5) fits a straight line with a high 
Pearson correlation coefficient (r² = 0.97). From the 
slope of this line, the mutation fixation rate can be 
estimated at 0.95 ± 0.05 substitutions per year. 
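
The slope above is expressed per sequence and per year. A short Python sketch of the conversion to a per-site rate follows. It assumes that the compared sites are the 1956 nt analyzed minus the two gapped regions (672 and 6 nt), which were excluded from the distance calculations; that site count is inferred here and is not stated explicitly in the text.

```python
SEQUENCED_LENGTH_NT = 1956     # segment used for the tree (from the text)
PRCV_DELETION_NT = 224 * 3     # 224 codons, i.e., 672 nt
PUR46_DELETION_NT = 2 * 3      # 2 codons, i.e., 6 nt
SUBSTITUTIONS_PER_YEAR = 0.95  # slope of the fitted line (from the text)


def per_site_rate(substitutions_per_year, compared_sites):
    """Convert a per-sequence rate into substitutions per site per year."""
    if compared_sites <= 0:
        raise ValueError("compared_sites must be positive")
    return substitutions_per_year / compared_sites


# Assumption: compared sites = analyzed segment minus both gapped regions.
compared_sites = SEQUENCED_LENGTH_NT - PRCV_DELETION_NT - PUR46_DELETION_NT
rate = per_site_rate(SUBSTITUTIONS_PER_YEAR, compared_sites)
print(f"{compared_sites} sites, {rate:.2e} substitutions per site per year")
```

Under this assumption the result is of the same order of magnitude as the per-site rate quoted in the abstract.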


DISCUSSION 


The structural proteins of seven new strains of the 
TGEV cluster with enteric and respiratory tropism have 
been analyzed. Also, the complete sequences of the S 
genes of three respiratory isolates and of the first
1956 nt of the S gene of four other respiratory viruses of the
TGEV antigenic cluster have been determined. These
sequences, together with published ones of enteric 
and respiratory TGEV isolates, have been analyzed to 
determine the genetic homology between TGEVs and 
PRCVs. Key point mutations which might be responsi- 
ble for the loss of enteric tropism in certain isolates 
have been identified. In addition, a large conserved 
area in the S protein has been identified, and an evolu- 
tionary tree relating all these viruses has been pro- 
posed. 

TGEVs were described for the first time in 1946
(Doyle and Hutchings, 1946). Respiratory variants of
the enteric virus were isolated in 1956 (TOY56 strain)
(Furuuchi et al., 1976) and in 1972 (NEB72 strain)
(Underdahl et al., 1974). Highly contagious respiratory
isolates that spread rapidly throughout Europe, the
PRCVs, were detected for the first time in 1984
(Pensaert et al., 1986). These viruses are serologically
related to TGEV and are missing antigenic sites B and C
[Fig. 3 graphic. Recoverable row labels in panel A: TOY 56; PRCV BEL 85-83, FRA 86-RM4, ENG 86-1, ENG 86-4, BEL 87-31, HOL 87. Panel B axis: number of amino acids changed.]


Fig. 3. Summary of the deletions and amino acid changes present 
in the S glycoprotein of TGEVs and PRCVs. (A) Location of the dele- 
tion. Full and empty bars indicate the sequences known and unde- 
termined, respectively. Letters indicate the approximate location of 
the antigenic sites. The numbers above these letters indicate amino 
acid residues involved in the formation of these sites. The position of 
the deletions is indicated by brackets, and the numbers next to the 
brackets show the amino acids flanking the deleted residues. (B)
Number of amino acid changes in sequential fragments of 20 aa
each, relative to the PUR46-MAD virus sequence. Only the segments
for which the sequences of the 13 virus strains were available
have been included in the comparison. Amino acid residues have
been numbered according to their position in the MIL65 virus after
the alignment. The origin of the sequences of the different strains
is indicated in the legend of Fig. 2.


(Callebaut et al., 1988; Sanchez et al., 1990). In the five
European isolates sequenced by us, the absence of
these sites is due to a deletion of 224 amino acids
starting at residue 21. An identical deletion (in both
size and location) was described for another European
isolate (Rasschaert et al., 1990). In 1989 a virus
(IND89) with the antigenic characteristics of the European
PRCVs was isolated in the United States (Wesley
et al., 1990b). Sequencing of the first 200 nt of the S
gene showed a deletion of 227 amino acids starting at
residue 23 (Wesley et al., 1991); i.e., the deletion is
shifted two residues downstream relative to the position
of the deletion described for the European
PRCVs. These data, together with the high sequence
homology (Fig. 2) and the phylogenetic tree obtained
(Fig. 4), demonstrate that all six European PRCVs, isolated
in four countries (Belgium, France, The Netherlands,
and the United Kingdom), have a recent common
ancestor. In contrast, the North American isolate is
probably of independent origin, since (i) if it were derived
from the European PRCVs, the addition of several nucleotides
after nt 59 or 60 and the deletion of a few nucleotides
at the end of the deletion present in the European
PRCVs would have been required (the identity of
the nucleotides at the beginning and at the end of the
deletion leaves open the precise position of the deletion
in both the European and the North American
PRCVs); (ii) differences between the genes coding for
the nonstructural proteins of the European and North
American isolates have also been reported (Rasschaert
et al., 1990; Wesley et al., 1991); and (iii) the
200-nt sequence available for the North American isolate
placed this PRCV strain closer to the enteric isolates
than to the European PRCVs in our evolutionary
tree (results not shown).

The four antigenic sites described in the S glycoprotein
of TGEV have been mapped to the NH2-terminal
half of the S protein (Delmas et al., 1990; Gebauer et al.,
1991). These sites are probably located in the globular
part of the S molecule (De Groot et al., 1987; C. Suñé,
M. Nermut, J. L. Carrascosa, and L. Enjuanes, unpublished
results). In other coronaviruses the S glycoprotein
can be split into two subunits, S1 and S2, which
probably contain the globular and stem portions of the
molecule, respectively (Spaan et al., 1988). The precise
residue where the stem part of the S peplomer
might start awaits elucidation of its atomic structure.
The protein domain that includes antigenic subsites Aa
and Ab and site D (Gebauer et al., 1991) showed a
slightly higher number of amino acid changes than did
other areas of the S protein (Fig. 3). Nevertheless, the
overall homology in the globular and stem areas of the
S protein is similar at both the nucleotide and amino acid
levels (results not shown). This result was not anticipated,
given the higher antigenicity of the globular area and the
presence there of epitopes relevant to virus neutralization.

The receptor binding site (RBS) in the S glycoprotein of
TGEV that interacts with ST cells probably maps between
sites A and D, since TGEV binding to ST cells is
best inhibited by MAbs specific for these sites (Suñé et
al., 1990). A candidate domain for the location of
this RBS is the highly conserved
area identified between amino acids 406 and
465 (Figs. 2 and 3), although other domains around
this area cannot be ruled out. A second RBS may be
used by TGEVs with enteric tropism to infect enteric cells.
This RBS might be located within the area of the
S protein deleted in the PRCVs. More precisely, it could
be located around either aa 92, 94, and 218 or aa 219,
changed in the TOY56 or in the NEB72 isolates, respectively.
Interestingly, both viruses have an amino
Fig. 4. Evolutionary tree of TGEV-related coronaviruses. Neighbor-joining and least squares methods of tree reconstruction were
applied to the first 1956 nt of 13 virus isolates (the 11 isolates indicated in Fig. 2, and the clones PUR46-PAR and PUR46-UTR previously
reported) (Table 1). Numbers in the diagram indicate residue substitutions between branching points. Δ indicates the introduction of a deletion
between branching points. * indicates that all the descendants of this fork have, with a probability of 99.99%, a recent common ancestor.


acid change in contiguous residues (218 and 219), 
suggesting that these residues may be involved in the 
RBS. Tissue-specific tropism of coronaviruses is con- 
ditioned by the S glycoprotein, and different RBSs in 
this protein could be recognized by the respiratory or 
the enteric tissues. Nevertheless, other viral or cellular 
regulatory mechanisms affecting essential steps of 
virus replication, other than virus-to-cell binding, could 
Fig. 5. Relationship between mutation fixation rate and year of
isolation. The line relating the number of mutations from the origin to
the year of isolation was plotted. Line and origin were estimated at
the same time by linear least squares fit. The expression for the line
was d = 0.95 t - 1893, r² = 0.97, where d is the distance to the
origin, t is the time in years, and r² is Pearson's correlation
coefficient (Sokal and Rohlf, 1981). The data correspond to the viral
isolates used in the construction of the evolutionary tree (Fig. 4). The
line with the minimum square error was determined and represented.
The point showing the poorest fit to the line corresponds to the
NEB72 isolate.


influence viral tropism (Levine, 1984). Genes control- 
ling those regulatory mechanisms could map to areas 
away from the S gene. Studies based on recombina- 
tion between TGEVs with enteric and respiratory tro- 
pism will help to identify the existence of these genes. 

Based on nucleotide sequencing data (Fig. 2), an
evolutionary tree has been proposed that relates
13 PRCV and TGEV isolates (Fig. 4). Since we are dealing
with a limited number of isolates from each continent,
it is understood that the inclusion
of additional sequences from isolates of other
areas (i.e., Japan) could show that certain lateral
branches are in fact main branches. Only one isolate
(NEB72) was out of place. According to the evolutionary
tree, it should have been isolated at the same
time as the PUR46 strain, since it has accumulated a
similar number of point mutations and has the same
6-nt deletion that is present in the PUR46 strains.
NEB72 probably represents a virus reintroduction, like
those described in other viral systems (Beck and
Strohmaier, 1987; Carrillo et al., 1990). Least squares
estimation of the origin of TGEV-related coronaviruses
demonstrates a significant constancy in the fixation of
mutations with time, that is, the existence of a well-defined
molecular clock (Kimura, 1983). The mutation fixation
rate is 0.95 ± 0.05 substitutions per year. As
this rate was measured over 1260 nt, it can be expressed
as 7 ± 2 × 10⁻⁴ substitutions per nucleotide
per year. This rate falls in the range reported for
other RNA viruses (Domingo and Holland, 1988). The
direction defined for the evolutionary process from the
predicted origin supports the occurrence of two deletions:
one of 6 nt in the lineage from the root to the PUR46
strains and another of 672 nt in the lineage leading
from TGEV to the PRCVs. It may be concluded that the
European PRCVs were derived by a 672-nt deletion
from an enteric TGEV, since we have examined
isolates preceding the PRCVs. In contrast, it cannot be
guaranteed that PUR46 emerged by a 6-nt deletion
from an unknown ancestor. An alternative explanation
is that the other enteric isolates shown in Fig. 4
were derived from PUR46 by the addition of
6 nt.
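The conversion between the two ways of expressing the rate is a single division, and the same clock gives the number of substitutions expected for an isolate of a given year. The following is a minimal sketch using only figures quoted in the text (0.95 substitutions per year, 1260 nt, origin around 1941); the function names are illustrative and not part of the original analysis.

```python
def per_site_rate(subs_per_year: float, sites: int) -> float:
    """Convert a per-sequence substitution rate into a per-site rate."""
    if sites <= 0:
        raise ValueError("sites must be positive")
    return subs_per_year / sites


def expected_substitutions(year: float, origin: float = 1941.0,
                           subs_per_year: float = 0.95) -> float:
    """Substitutions expected under a strict clock for an isolate of `year`."""
    return subs_per_year * (year - origin)


rate = per_site_rate(0.95, 1260)
print(f"{rate:.1e} substitutions per nucleotide per year")

# Under the clock, an isolate collected in 1972 would be expected to carry
# roughly this many substitutions relative to the predicted origin.
print(f"{expected_substitutions(1972):.2f} expected substitutions for 1972")
```

The computed per-site value, about 7.5 × 10⁻⁴, lies within the reported 7 ± 2 × 10⁻⁴. An isolate carrying far fewer substitutions than the expected figure for its year, as NEB72 does, stands out against the clock.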

It is interesting to note that the area deleted in the
TGEV S gene to form the PRCVs contains repeated tetrameric
(TTCC) or heptameric (AGTTTCC) sequences.
These repeated sequences could be involved
in inter- or intramolecular recombination by a copy choice
mechanism (Lai, 1992), which could have originated
the deletion. In coronaviruses and other RNA viruses
containing positive-strand RNA genomes, recombinant
clones have been isolated with borders at the
crossover sites containing some sequence similarity
(Banner and Lai, 1991; Raffo and Dawson, 1991; Cascone
et al., 1990). Since the putative crossover observed
in the generation of PRCVs does not occur at
homologous sequences, the deletion might have
originated by nonhomologous recombination. This
mechanism has previously been implicated in the evolution
of coronaviruses, Sindbis virus, and plant viruses
(Banner and Lai, 1991; Monroe and Schlesinger, 1983;
Bujarski and Dzianott, 1991). If recombination was
the cause of the deletions present in PUR46
and the PRCVs, then two mechanisms of evolution would
be involved in the antigenic variation of TGEV: point
mutations and recombination.
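Locating repeated short motifs such as TTCC or AGTTTCC in a nucleotide sequence is simple to script. The sketch below is illustrative only: `find_motif` is a hypothetical helper, and the `toy` string is an invented sequence, not part of the TGEV S gene.

```python
from typing import List


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 1-based start positions of every occurrence of `motif`.

    Overlapping occurrences are reported as well.
    """
    seq = sequence.upper()
    pat = motif.upper()
    if not pat:
        raise ValueError("motif must be non-empty")
    positions = []
    start = seq.find(pat)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = seq.find(pat, start + 1)
    return positions


# Invented sequence for illustration; NOT taken from the TGEV S gene.
toy = "GGAGTTTCCATATTCCGGAGTTTCCTT"
for motif in ("TTCC", "AGTTTCC"):
    print(motif, find_motif(toy, motif))
```

Because the heptamer AGTTTCC ends in the tetramer TTCC, every heptamer hit also produces a tetramer hit three positions downstream, which should be kept in mind when counting repeats.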


ACKNOWLEDGMENTS 


We are grateful to J. A. Garcia for critical comments on the manuscript
and to J. Palacin for his excellent technical assistance. F.G. and
C.S. received fellowships from the Spanish Ministry of Education
and Science. This investigation has been funded by grants from the
Comisión Interministerial de Ciencia y Tecnología (Projects BIO
0214 and BIO 89-0668-603-03), European Communities (Project
BAP 0464 —), NATO (Project CRG 900430), and Fundación Ramón
Areces.


REFERENCES 


ANSORGE, W. (1985). Fast and sensitive detection of protein and DNA
bands by treatment with potassium permanganate. J. Biochem.
Biophys. Methods 11, 13-20.

BANNER, L. R., and LAI, M. M. C. (1991). Random nature of coronavirus
RNA recombination in the absence of selection pressure.
Virology 185, 441-445.

BECK, E., and STROHMAIER, K. (1987). Subtyping of European foot-and-mouth
disease virus strains by nucleotide sequence determination.
J. Virol. 61, 1621-1629.

BOHL, E. H., GUPTA, R. K. P., OLQUIN, M. V. F., and SAIF, L. (1972).
Antibody responses in serum, colostrum and milk of swine after
infection or vaccination with transmissible gastroenteritis virus.
Infect. Immun. 6, 289-301.

BRITTON, P., and PAGE, K. W. (1990). Sequence of the S gene from a
virulent British field isolate of transmissible gastroenteritis virus.
Virus Res. 18, 71-80.

BRITTON, P., PAGE, K. W., MAWDITT, K., and POCOCK, D. H. (1990).
“Sequence Comparison of Porcine Transmissible Gastroenteritis
Virus (TGEV) with Porcine Respiratory Coronavirus,” Seventh International
Congress of Virology, pp. P6-O18. IUMS, Berlin.

BROWN, J., and CARTWRIGHT, S. F. (1986). New porcine coronavirus?
Vet. Rec. 119, 282-283.

BUJARSKI, J. J., and DZIANOTT, A. M. (1991). Generation and analysis of
nonhomologous RNA-RNA recombinants in Brome mosaic virus:
Sequence complementarities at crossover sites. J. Virol. 65,
4153-4159.

CALLEBAUT, P., CORREA, I., PENSAERT, M., JIMENEZ, G., and ENJUANES,
L. (1988). Antigenic differentiation between transmissible gastroenteritis
virus of swine and a related porcine respiratory coronavirus.
J. Gen. Virol. 69, 1725-1730.

CARRILLO, C., DOPAZO, J., MOYA, A., GONZALEZ, N., MARTINEZ, M. A.,
SAIZ, J. C., and SOBRINO, F. (1990). Comparison of vaccine strains
and the virus causing the 1987 foot-and-mouth disease outbreak
in Spain: epizootiological analysis. Virus Res. 15, 45-56.

CASCONE, P. J., CARPENTER, C. D., LI, X. H., and SIMON, A. E. (1990).
Recombination between satellite RNAs of turnip crinkle virus.
EMBO J. 9, 1709-1715.

CORREA, I., GEBAUER, F., BULLIDO, M. J., SUÑÉ, C., BAAY, M. F. D.,
ZWAAGSTRA, K. A., POSTHUMUS, W. P. A., LENSTRA, J. A., and ENJUANES,
L. (1990). Localization of antigenic sites of the E2 glycoprotein
of transmissible gastroenteritis coronavirus. J. Gen. Virol.
71, 271-279.

COX, E., HOOYBERGHS, J., and PENSAERT, M. B. (1990a). Sites of replication
of a porcine respiratory coronavirus related to transmissible
gastroenteritis virus. Res. Vet. Sci. 48, 165-169.

COX, E., PENSAERT, M. B., CALLEBAUT, P., and VAN DEUN, K. (1990b).
Intestinal replication of a porcine respiratory coronavirus closely
related antigenically to the enteric transmissible gastroenteritis
virus. Vet. Microbiol. 23, 237-243.

DE GROOT, R. J., LUYTJES, W., HORZINEK, M. C., VAN DER ZEIJST,
B. A. M., SPAAN, W. J. M., and LENSTRA, J. A. (1987). Evidence for a
coiled-coil structure in the spike protein of coronaviruses. J. Mol.
Biol. 196, 963-966.

DELMAS, B., RASSCHAERT, D., GODET, M., GELFI, J., and LAUDE, H.
(1990). Four major antigenic sites of the coronavirus transmissible
gastroenteritis virus are located on the amino-terminal half of spike
protein. J. Gen. Virol. 71, 1313-1323.

DOMINGO, E., and HOLLAND, J. J. (1988). High error rates, population
equilibrium and evolution of RNA replication systems. In “RNA
Genetics” (E. Domingo, J. J. Holland, and P. Ahlquist, Eds.), Vol. 3,
pp. 3-36. CRC Press, Boca Raton, FL.

DOYLE, L. P., and HUTCHINGS, L. M. (1946). A transmissible gastroenteritis
in pigs. J. Am. Vet. Med. Assoc. 108, 257-259.

DURET, C., BRUN, A., GUILMOTO, H., and DAUVERGNE, M. (1988). Isolement,
identification et pouvoir pathogène chez le porc d’un coronavirus
apparenté au virus de la gastro-entérite transmissible. Rec.
Méd. Vét. 164, 221-226.

EFRON, B. (1982). “The Jackknife, the Bootstrap and Other Resampling
Plans.” Society for Industrial and Applied Mathematics, Philadelphia.

ENJUANES, L., and VAN DER ZEIJST, B. A. M. (1992). Molecular basis of
transmissible gastroenteritis coronavirus (TGEV) epidemiology. In
“Coronaviruses” (S. G. Siddell, Ed.), Plenum, New York.

FELSENSTEIN, J. (1985). Confidence limits on phylogenies: An approach
using the bootstrap. Evolution 39, 783-791.

FELSENSTEIN, J. (1990). “PHYLIP Manual Version 3.3.” Editors. University
Herbarium. University of California, Berkeley, California.




FICHOT, O., and GIRARD, M. (1990). An improved method for sequencing
of RNA templates. Nucleic Acids Res. 18, 6162.

FITCH, W. M., and MARGOLIASH, E. (1967). Construction of phylogenetic
trees. Science 155, 279-284.

FURUUCHI, S., SHIMIZU, Y., and KUMAGAI, T. (1976). Vaccination of
pigs with an attenuated strain of transmissible gastroenteritis
virus. Am. J. Vet. Res. 37, 1401-1404.

GARWES, D. J., LUCAS, M. H., HIGGINS, D. A., PIKE, B. V., and
CARTWRIGHT, S. F. (1978). Antigenicity of structural components
from porcine transmissible gastroenteritis virus. Vet. Microbiol. 3,
179-190.

GARWES, D. J., STEWART, F., CARTWRIGHT, S. F., and BROWN, I. (1988).
Differentiation of porcine coronavirus from transmissible gastroenteritis
virus. Vet. Rec. 122, 86-87.

GEBAUER, F., POSTHUMUS, W. A. P., CORREA, I., SUÑÉ, C., SANCHEZ,
C. M., SMERDOU, C., LENSTRA, J. A., MELOEN, R., and ENJUANES, L.
(1991). Residues involved in the formation of the antigenic sites of
the S protein of transmissible gastroenteritis coronavirus. Virology
183, 225-238.

JACOBS, L., DE GROOT, R., VAN DER ZEIJST, B. A. M., HORZINEK, M. C.,
and SPAAN, W. (1987). The nucleotide sequence of the peplomer
gene of porcine transmissible gastroenteritis virus (TGEV): Comparison
with the sequence of the peplomer protein of feline infectious
peritonitis virus (FIPV). Virus Res. 8, 363-371.

JUKES, T. H., and CANTOR, C. R. (1969). Evolution of protein molecules.
In “Mammalian Protein Metabolism” (H. N. Munro, Ed.),
pp. 21-132. Academic Press, New York.

KIMURA, M. (1983). “The Neutral Theory of Molecular Evolution.”
Cambridge Univ. Press, London.

KING, A. M. Q. (1988). Genetic recombination in positive strand RNA
viruses. In “RNA Genetics” (E. Domingo, J. J. Holland, and P. Ahlquist,
Eds.), Vol. 2, pp. 149-165. CRC Press, Boca Raton, FL.

LAI, M. M. (1992). RNA recombination in animal and plant viruses.
Microbiol. Rev. 56, 61-79.

LAEMMLI, U. K. (1970). Cleavage of structural proteins during the
assembly of the head of bacteriophage T4. Nature 227, 680-685.

LEVINE, A. J. (1984). Viruses and differentiation: The molecular basis
of viral tissue tropisms. In “Concepts in Viral Pathogenesis” (A. L.
Notkins and M. B. A. Oldstone, Eds.), pp. 130-134. Springer-Verlag,
New York.

MCCLURKIN, A. W., and NORMAN, J. O. (1966). Studies on transmissible
gastroenteritis of swine. II. Selected characteristics of a cytopathogenic
virus common to five isolates of transmissible gastroenteritis.
Can. J. Comp. Med. Vet. Sci. 30, 190-198.

MONROE, S. S., and SCHLESINGER, S. (1983). RNAs from two independently
isolated defective interfering particles of Sindbis virus contain
a cellular tRNA sequence at their 5' ends. Proc. Natl. Acad.
Sci. USA 80, 3279-3283.

PENSAERT, M., CALLEBAUT, P., and VERGOTE, J. (1986). Isolation of a
porcine respiratory, non-enteric coronavirus related to transmissible
gastroenteritis. Vet. Q. 8, 257-260.


RAFFO, A. J., and DAWSON, W. O. (1991). Construction of Tobacco
mosaic virus subgenomic replicons that are replicated and spread
systemically in tobacco plants. Virology 184, 277-289.

RASSCHAERT, D., and LAUDE, H. (1987). The predicted primary structure
of the peplomer protein E2 of the porcine coronavirus transmissible
gastroenteritis virus. J. Gen. Virol. 68, 1883-1890.

RASSCHAERT, D., DUARTE, M., and LAUDE, H. (1990). Porcine respiratory
coronavirus differs from transmissible gastroenteritis virus by
a few genomic deletions. J. Gen. Virol. 71, 2599-2607.

SAITOU, N. M., and NEI, M. (1987). The neighbor-joining method: A
new method for reconstructing phylogenetic trees. Mol. Biol. Evol.
4, 406-425.

SANCHEZ, C. M., JIMENEZ, G., LAVIADA, M. D., CORREA, I., SUÑÉ, C.,
BULLIDO, M. J., GEBAUER, F., SMERDOU, C., CALLEBAUT, P., ESCRIBANO,
J. M., and ENJUANES, L. (1990). Antigenic homology among coronaviruses
related to transmissible gastroenteritis virus. Virology 174,
410-417.

SANGER, F., NICKLEN, S., and COULSON, A. R. (1977). DNA sequencing
with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA
74, 5463-5467.

SIDDELL, S. G., WEGE, H., and TER MEULEN, V. (1982). The structure
and replication of coronaviruses. Curr. Topics Microbiol. Immunol.
99, 131-163.

SOKAL, R. R., and ROHLF, F. J. (1981). “Biometry.” Freeman, New
York.

SOURDIS, J., and NEI, M. (1988). Relative efficiencies of the maximum
parsimony and distance-matrix methods in obtaining the correct
phylogenetic tree. Mol. Biol. Evol. 5, 298-311.

SPAAN, W., CAVANAGH, D., and HORZINEK, M. C. (1988). Coronaviruses:
Structure and genome expression. J. Gen. Virol. 69, 2939-2952.

SUÑÉ, C., JIMENEZ, G., CORREA, I., BULLIDO, M. J., GEBAUER, F., SMERDOU,
C., and ENJUANES, L. (1990). Mechanisms of transmissible
gastroenteritis coronavirus neutralization. Virology 177, 559-569.

UNDERDAHL, N. R., MEBUS, C. A., STAIR, E. L., RHODES, M. B., MCGILL,
L. D., and TWIEHAUS, M. J. (1974). Isolation of transmissible gastroenteritis
virus from lungs of market-weight swine. Am. J. Vet.
Res. 35, 1209-1216.

WESLEY, R. D. (1990). Nucleotide sequence of the E2-peplomer protein
gene and partial nucleotide sequence of the upstream polymerase
gene of transmissible gastroenteritis virus (Miller strain).
Adv. Exp. Med. Biol. 276, 301-306.

WESLEY, R. D., WOODS, R. D., and CHEUNG, A. K. (1990a). Genetic
basis for the pathogenesis of transmissible gastroenteritis virus. J.
Virol. 64, 4761-4766.

WESLEY, R. D., WOODS, R. D., HILL, H. T., and BIWER, J. D. (1990b).
Evidence for a porcine respiratory coronavirus, antigenically similar
to transmissible gastroenteritis virus, in the United States. J.
Vet. Diagn. Invest. 2, 312-317.

WESLEY, R. D., WOODS, R. D., and CHEUNG, A. K. (1991). Genetic
analysis of porcine respiratory coronavirus, an attenuated variant
of transmissible gastroenteritis virus. J. Virol. 65, 3369-3373.


